A Longest Matching Segment Approach with Baysian Adaptation - Application to Noise-Robust Speaker Recognition
نویسندگان
چکیده
Temporal dynamics is an important feature of speech that distinguishes speech from noise, as well as distinguishing between different speakers. In this paper, we present an approach to extract long-range temporal dynamics of speech for text-independent speaker recognition. We aim to maximize the noise immunity arising from the distinct temporal dynamics of speech. The new approach achieves this by identifying the longest matching segments between the training data and test data for recognition. Additionally, the new approach combines Bayesian adaptation, multicondition training and missingfeature theory to further advance the ability to model noisy speech. Experiments have been conducted on the NIST 2002 SRE database in the presence of various types of noise including fast-varying song and music. The new approach has shown improved performance over conventional noise-robust techniques.
منابع مشابه
A longest matching segment approach for text-independent speaker recognition
We describe a new approach for segment-based speaker recognition, given text-independent training and test data. We assume that utterances from the same speaker have more and longer matching acoustic segments, compared to utterances from different speakers. Therefore, we identify the longest matching segments, at each frame location, between the training and test utterances, and base recognitio...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker and Noise Factorisation for Robust Speech Recognition
Speech recognition systems need to operate in a wide range of conditions. Thus they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel and background noise. For many scenarios, multiple factors simultaneously impact the underlying “clean” speech signal. This paper examines techniques to handle both speaker and back...
متن کاملImproved Jacobian Adaptation for Robust Speaker Verification
Jacobian Adaptation (JA) has been successfully used in Automatic Speech Recognition (ASR) systems to adapt the acoustic models from the training to the testing noise conditions. In this work we present an improvement of JA for speaker verification, where a specific training noise reference is estimated for each speaker model. The new proposal, which will be referred to as Model-dependent Noise ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011